88 results found.
Written
Second language learner corpus,
Language Type:
Multilingual
Languages:
Czech German Italian
Availability:
Publicly available
License:
CC BY-SA 4.0
Size:
2,286 texts
Production Status:
Finished
Use:
-
Paper title:Reproducing Monolingual, Multilingual and Cross-Lingual CEFR Predictions
-
Paper track:Evaluation/poster presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yves Bestgen | MERLIN corpus | /N |
Documentation:
See https://merlin-platform.eu/index.php
,
Language Type:
Multilingual
Languages:
Czech German Italian
Availability:
Freely Available
License:
<Not Specified>
Size:
2,286 texts
Production Status:
Newly created-finished
Use:
resource for language learning and teaching
-
Paper title:Reproduction and Replication: A Case Study with Automatic Essay Scoring
-
Paper track:Evaluation/poster presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eva Huber | MERLIN Corpus | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English Italian
Availability:
From Owner
License:
Size:
318725 words
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
-
Paper title:DecOp: A Multilingual and Multi-domain Corpus For Detecting Deception In Typed Text
-
Paper track:Written/oral presentation
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Pasquale Capuozzo | The DecOp corpus | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English French German Italian Spanish
Availability:
Freely Available
License:
Creative Commons
Size:
59364453 sentences
Production Status:
Newly created-finished
Use:
Word Sense Disambiguation
-
Paper title:Sense-Annotated Corpora for Word Sense Disambiguation in Multiple Languages and Domains
-
Paper track:Written/poster presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bianca Scarlini | OneSeC | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Italian
Availability:
Freely Available
License:
Size:
12822 words
Production Status:
Existing-updated
Use:
Lexical Normalization
-
Paper title:Norm It! Lexical Normalization for Italian and Its Downstream Effects for Dependency Parsing
-
Paper track:Evaluation/poster presentation with demo
-
Paper status:Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Rob van der Goot | Norm It | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese Dutch French German Italian Mongolian Persian Russian Spanish Swedish Turkish
Availability:
Freely Available
License:
CC0
Size:
700 hours
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus
-
Paper track:Speech/oral presentation
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Changhan Wang | CoVoST | /N |
Documentation:
https://github.com/facebookresearch/covost
Written
Corpus,
Language Type:
Multilingual
Languages:
English Esperanto French German Portuguese Spanish Italian Russian Turkish + 88 others
Availability:
Freely Available
License:
CC-BY 2.0 FR
Size:
2 789 631 sentences
Production Status:
Newly created-finished
Use:
Textual Entailment and Paraphrasing
-
Paper title:TaPaCo: A Corpus of Sentential Paraphrases for 73 Languages
-
Paper track:Evaluation/poster presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yves Scherrer | TaPaCo | /N |
Documentation:
the paper itself
Written
Corpus,
Language Type:
Monolingual
Languages:
Italian
Availability:
Downloadable from the web through the provided scripts
License:
Size:
275,000 articles
Production Status:
Newly created-finished
Use:
Natural Language Generation
-
Paper title:Invisible to People but not to Machines: Evaluation of Style-aware Headline Generation in Absence of Reliable Human Judgment
-
Paper track:Evaluation/oral presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Lorenzo De Mattei | RepGioDataset | /N |
Documentation:
Documentation available in English
Written
Lexicon,
Language Type:
Multilingual
Languages:
Albanian Arabic Basque Bulgarian Catalan Chinese Croatian Danish Dutch English Finnish French Galician Greek Hebrew Icelandic Indonesian Italian Japanese Lithuanian Malay Norwegian Persian Polish Portuguese Romanian Slovak Slovene Spanish Swedish Thai
Availability:
Freely Available
License:
Multiple Licenses
Size:
1072646 synsets
Production Status:
Existing-used
Use:
All of the above
-
Paper title:Some Issues with Building a Multilingual Wordnet
-
Paper track:Infrastructural Issues/Large Projects/oral presentation
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | John P. McCrae | Open Multilingual WordNet | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Italian
Availability:
Freely Available
License:
Size:
4.2M words
Production Status:
Existing-used
Use:
Language Modelling
-
Paper title:On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning
-
Paper track:Evaluation/poster presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yerai Doval | ItWac | /N |
Documentation:
None




